Distributed Set Expression Cardinality Estimation

Authors

  • Abhinandan Das
  • Sumit Ganguly
  • Minos N. Garofalakis
  • Rajeev Rastogi
Abstract

We consider the problem of estimating set-expression cardinality in a distributed streaming environment where rapid update streams originating at remote sites are continually transmitted to a central processing system. At the core of our algorithmic solutions for answering set-expression cardinality queries are two novel techniques for lowering data communication costs without sacrificing answer precision. Our first technique exploits global knowledge of the distribution of certain frequently occurring stream elements to significantly reduce the transmission of element state information to the central site. Our second technical contribution involves a novel way of capturing the semantics of the input set expression in a boolean logic formula, and using models (of the formula) to determine whether an element state change at a remote site can affect the set expression result. Results of our experimental study with real-life as well as synthetic data sets indicate that our distributed set-expression cardinality estimation algorithms achieve substantial reductions in message traffic compared to naive approaches that provide the same accuracy guarantees.
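The second technique in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm, only a brute-force illustration under stated assumptions: the example expression (S1 ∪ S2) − S3, the function names `expr` and `change_can_matter`, and the `known` dictionary (membership bits assumed to be already known for an element) are all hypothetical. The idea shown is that an element belongs to the set-expression result exactly when a boolean formula over its per-stream membership bits is true, so a membership change at one site only needs to be reported if some consistent assignment of the remaining bits makes the formula's value depend on that site's bit.

```python
from itertools import product

STREAMS = ["S1", "S2", "S3"]  # hypothetical stream names


def expr(b):
    # Membership formula for the example expression (S1 ∪ S2) − S3:
    # an element is in the result iff it is in S1 or S2, and not in S3.
    return (b["S1"] or b["S2"]) and not b["S3"]


def change_can_matter(formula, streams, site, known):
    """Return True if flipping `site`'s membership bit can change the
    formula's value for some assignment of the other bits that is
    consistent with `known` (a dict of already-known membership bits)."""
    free = [s for s in streams if s != site and s not in known]
    # Brute force over all assignments of the unknown bits (toy sizes only).
    for values in product([False, True], repeat=len(free)):
        b = dict(known)
        b.update(zip(free, values))
        b[site] = False
        value_when_absent = formula(b)
        b[site] = True
        value_when_present = formula(b)
        if value_when_absent != value_when_present:
            return True
    return False


if __name__ == "__main__":
    # Element known to be in S3: it can never be in the result,
    # so a change at S1 is irrelevant.
    print(change_can_matter(expr, STREAMS, "S1", {"S3": True}))
    # Nothing known: a change at S1 may matter.
    print(change_can_matter(expr, STREAMS, "S1", {}))
```

In this sketch, a site that learns an element is already in S3 could suppress updates about that element from S1 and S2, which is the kind of message saving the abstract describes. The exhaustive enumeration is exponential in the number of unknown bits and is used here only for clarity.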

Similar resources

A Simple and Efficient Estimation Method for Stream Expression Cardinalities

Estimating the cardinality (i.e. number of distinct elements) of an arbitrary set expression defined over multiple distributed streams is one of the most fundamental queries of interest. Earlier methods based on probabilistic sketches have focused mostly on the sketching algorithms. However, the estimators do not fully utilize the information in the sketches and thus are not statistically effic...

Exact Cardinality Query Optimization for Optimizer Testing

The accuracy of cardinality estimates is crucial for obtaining a good query execution plan. Today's optimizers make several simplifying assumptions during cardinality estimation that can lead to large errors and hence poor plans. In a scenario such as query optimizer testing it is very desirable to obtain the "best" plan, i.e., the plan produced when the cardinality of each relevant expression ...

An Efficient Distributed Compressed Sensing Algorithm for Decentralized Sensor Network

We consider the joint sparsity Model 1 (JSM-1) in a decentralized scenario, where a number of sensors are connected through a network and there is no fusion center. A novel algorithm, named distributed compact sensing matrix pursuit (DCSMP), is proposed to exploit the computational and communication capabilities of the sensor nodes. In contrast to the conventional distributed compressed sensing...

Maximum Likelihood Method for RFID Tag Set Cardinality Estimation using Multiple Independent Reader Sessions

In this paper, Radio Frequency IDentification (RFID) tag set cardinality estimation problem is considered under the model of multiple independent reader sessions with unreliable radio communication links in which transmission errors might occur. After the R-th reader session, the number of tags detected in j (j = 1, 2, ..., R) reader sessions is updated, which we call observed evidence. Then, i...

Uniform-in-Bandwidth Nearest-Neighbor Density Estimation

We are concerned with the nonparametric estimation of the density f(·) of a random variable [rv] X ∈ R by the nearest-neighbor [NN] method. The NN estimators are motivated as follows (see, e.g., Fix and Hodges [17]). Let X1, X2, ... be independent and identically distributed [iid] random copies of X, with distribution function [df] F(x) := P(X ≤ x), for x ∈ R. Denote the empirical df based upo...

Publication date: 2004